Molecular Embedding via a Second Order Dissimilarity Parameterized Approach
Abstract
We describe a computational approach to the embedding problem in structural molecular biology. The approach is based on a dissimilarity parameterization of the problem that leads to a large-scale nonconvex bound constrained matrix optimization problem. The underlying idea is that an increased number of independent variables decouples the complicated effects of varying the location of individual atoms in coordinate-based formulations. Numerical tests support this hypothesis and indicate that the optimization problem that results is relatively benign and easy to solve, despite being large and nonconvex. We can solve problems with millions of independent variables in a few dozen to a few score optimization iterations. The nonconvexity arises due to matrix rank constraints in the problem, and we focus on their efficient computational treatment. We present numerical results for a number of synthetic and real protein data sets and comment on features of real experimental data that can cause computational difficulties.
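The rank constraint mentioned in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' algorithm: it is classical multidimensional scaling, shown only to make concrete why a matrix of squared interatomic distances is embeddable in three dimensions exactly when the associated Gram matrix is positive semidefinite with rank at most 3. The function names (`squared_distances`, `gram_from_squared_distances`, `embed`) and the random test data are assumptions made for illustration, and the sketch assumes a complete, noise-free distance matrix, which real experimental data do not provide.

```python
import numpy as np


def squared_distances(X):
    """Return the n-by-n matrix of squared Euclidean distances between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def gram_from_squared_distances(D2):
    """Double-center a squared-distance matrix to obtain a Gram matrix.

    For exact distances between points in R^d, the result is positive
    semidefinite with rank at most d.
    """
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n  # centering projector
    return -0.5 * J @ D2 @ J


def embed(D2, dim=3):
    """Recover coordinates (up to rigid motion and reflection) from D2."""
    G = gram_from_squared_distances(D2)
    w, V = np.linalg.eigh(G)               # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]        # indices of the `dim` largest
    w_top = np.clip(w[idx], 0.0, None)     # guard against tiny negatives
    return V[:, idx] * np.sqrt(w_top)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))           # hypothetical "atom" positions
    D2 = squared_distances(X)
    Y = embed(D2, dim=3)
    G = gram_from_squared_distances(D2)
    print("numerical rank of Gram matrix:", np.linalg.matrix_rank(G, tol=1e-8))
    print("max distance error:", np.abs(squared_distances(Y) - D2).max())
```

In this idealized setting the rank condition is satisfied automatically. The problem described in the abstract is harder because only bounds on some of the distances are known, so the rank constraint has to be enforced during the optimization rather than read off afterwards.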
Similar resources
Perception-based Visualization of High-Dimensional Medical Images Using Distance Preserving Dimensionality Reduction
A method for visualizing high dimensional medical image data is proposed. The method operates on images in which each pixel contains a high dimensional vector, e.g. a time activity curve (TAC) in a dynamic positron emission tomography (dPET) image, or a tensor, as is the case in diffusion tensor magnetic resonance images (DTMRI). A nonlinear mapping reduces the dimensionality of the data to ach...
Embedding of a 2D Graphene System in Non-Commutative Space
The BFT approach is used for the first time to formulate the electronic states in graphene through a non-commutative space in the presence of a constant magnetic field B. In this regard, we introduce a second-class constrained system, which is not gauge symmetric, but by applying the BFT method and extending the phase space, the second-class constraints convert to first-class constraints so th...
Ricci flow embedding for rectifying non-Euclidean dissimilarity data
Pairwise dissimilarity representations are frequently used as an alternative to feature vectors in pattern recognition. One of the problems encountered in the analysis of such data is that the dissimilarities are rarely Euclidean, while statistical learning algorithms often rely on Euclidean dissimilarities. Such non-Euclidean dissimilarities are often corrected or a consistent Euclidean geome...
Concept drift detection in business process logs using deep learning
Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g. the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...
A Biologically Consistent Model for Comparing Molecular Phylogenies
In the framework of the problem of combining different gene trees into a unique species phylogeny, a model for duplication/speciation/loss events along the evolutionary tree is introduced. The model is employed for embedding a phylogeny tree into another one via the so-called duplication/speciation principle requiring that the gene duplicated evolves in such a way that any of the contemporary s...
Journal: SIAM J. Scientific Computing
Volume 31
Publication date: 2009